HTTP request callback support by SteveSandersonMS · Pull Request #1689 · github/copilot-sdk

SteveSandersonMS · 2026-06-16T18:23:39Z

Summary

This PR adds SDK support for intercepting HTTP requests (for LLM inference or anything else) and handling them in user code across all six SDK languages: Node.js, .NET, Python, Go, Rust, and Java.

Consumers register one client-global CopilotRequestHandler (constructed once, no args). The runtime invokes it over JSON-RPC (llmInference.*) whenever it would otherwise issue a model-layer HTTP or WebSocket request — for both BYOK and CAPI — fully replacing the outbound call. A handler that overrides nothing is a transparent pass-through.

It includes the full feature work on this branch:

wire up LLM inference provider registration and generated RPC types
add the raw chunked callback protocol for outbound inference requests and responses
cover plain HTTP, streaming responses, cancellation (runtime- and consumer-initiated), error mapping, session-id threading, and WebSocket transport
port the feature to every SDK language with idiomatic HTTP types
keep the public callback surface to a single CopilotRequestHandler model with forwarding helpers

What changed

Shared protocol and plumbing

add generated RPC / session event types needed for LLM inference callbacks across all SDK surfaces
add SDK-side registration for a process-global LLM inference provider (llmInference.setProvider)
route outbound inference requests through the callback bridge instead of requiring provider-specific hooks

Per-language ports

Each language exposes the same CopilotRequestHandler model, mapped onto the most canonical HTTP representation available in that ecosystem:

Language	HTTP request/response type	WebSocket type
Node.js	`Request` / `Response` (Fetch)	per-connection handler
.NET	`HttpRequestMessage` / `HttpResponseMessage`	per-connection handler
Python	`httpx` request/response	per-connection handler
Go	`http.Request` / `http.Response`	per-connection handler
Rust	`http::Request` / `http::Response`	per-connection handler
Java	`java.net.http` `HttpRequest` / `HttpResponse`	per-connection handler

All ports thread cancellation and session id through the request context, and provide a CopilotWebSocketForwarder (or language equivalent) for the common mutate-and-forward case.

API shape

collapse HTTP interception to a single send hook (SendRequestAsync / sendRequest / send_request / language equivalents)
expose WebSocket interception through OpenWebSocketAsync / openWebSocket / open_websocket, returning a per-connection handler object
allow consumers to mutate, drop, duplicate, or fully replace request/response messages while keeping the common forwarding case straightforward

Usage examples

C#

using GitHub.Copilot;
using System.Net.Http;

sealed class MyHandler : CopilotRequestHandler
{
    protected override async Task<HttpResponseMessage> SendRequestAsync(
        HttpRequestMessage request,
        CopilotRequestContext ctx)
    {
        request.Headers.Add("X-Debug-Session", ctx.SessionId ?? "none");
        return await base.SendRequestAsync(request, ctx);
    }

    protected override Task<CopilotWebSocketHandler> OpenWebSocketAsync(CopilotRequestContext ctx)
        => Task.FromResult<CopilotWebSocketHandler>(new MyForwardingSocket(ctx));
}

sealed class MyForwardingSocket : CopilotWebSocketForwarder
{
    public MyForwardingSocket(CopilotRequestContext ctx)
        : base(ctx)
    {
    }

    public override Task SendRequestMessageAsync(CopilotWebSocketMessage message)
    {
        var text = message.GetText().Replace("model-A", "model-B");
        return base.SendRequestMessageAsync(CopilotWebSocketMessage.Text(text));
    }
}

Register the handler when constructing the client:

var client = new CopilotClient(new CopilotClientOptions { RequestHandler = new MyHandler() });

Node.js

import {
    CopilotRequestContext,
    CopilotRequestHandler,
    CopilotWebSocketForwarder,
} from "@github/copilot";

class MyHandler extends CopilotRequestHandler {
    protected override async sendRequest(request: Request, ctx: CopilotRequestContext): Promise<Response> {
        const headers = new Headers(request.headers);
        headers.set("x-debug-session", ctx.sessionId ?? "none");

        return super.sendRequest(new Request(request, { headers }), ctx);
    }

    protected override async openWebSocket(ctx: CopilotRequestContext) {
        return new MyForwardingSocket(ctx);
    }
}

class MyForwardingSocket extends CopilotWebSocketForwarder {
    override sendRequestMessage(data: string | Uint8Array) {
        if (typeof data === "string") {
            return super.sendRequestMessage(data.replace("model-A", "model-B"));
        }
        return super.sendRequestMessage(data);
    }
}

Java

import com.github.copilot.CopilotRequestContext;
import com.github.copilot.CopilotRequestHandler;
import com.github.copilot.CopilotWebSocketHandler;
import com.github.copilot.CopilotWebSocketMessage;
import com.github.copilot.CopilotWebSocketForwarder;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

final class MyHandler extends CopilotRequestHandler {
    @Override
    protected HttpResponse<InputStream> sendRequest(HttpRequest request, CopilotRequestContext ctx) throws Exception {
        HttpRequest mutated = HttpRequest.newBuilder(request, (n, v) -> true)
                .header("X-Debug-Session", ctx.sessionId() == null ? "none" : ctx.sessionId())
                .build();
        return super.sendRequest(mutated, ctx);
    }

    @Override
    protected CopilotWebSocketHandler openWebSocket(CopilotRequestContext ctx) {
        return new CopilotWebSocketForwarder(ctx) {
            @Override
            public void sendRequestMessage(CopilotWebSocketMessage message) throws Exception {
                String text = message.text().replace("model-A", "model-B");
                super.sendRequestMessage(CopilotWebSocketMessage.text(text));
            }
        };
    }
}

Register the handler when constructing the client:

CopilotClient client = new CopilotClient(
    new CopilotClientOptions().setRequestHandler(new MyHandler()));

Tests

Each language adds e2e coverage (mirroring a shared reference suite) for:

callback provider registration
HTTP inference interception
streaming inference interception
error mapping
runtime-initiated and consumer-initiated cancellation
session-id threading
WebSocket callback handling
an idiomatic handler test exercising mutate-and-forward over both HTTP and WebSocket

Resolves github/copilot-sdk-internal#88

github-advanced-security

CodeQL found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.

Adds an opt-in llmInference config to CopilotClientOptions that lets SDK consumers register a callback the runtime invokes whenever it would otherwise issue an outbound non-streaming LLM HTTP request itself. v1 scope is TS-only/non-streaming, mirroring the runtime support added in github/copilot-agent-runtime. Streaming SSE and WebSocket transports are out of scope for v1 and continue to bypass the callback.

- New `LlmInferenceProvider` interface with a single `onLlmRequest` method.
- `createLlmInferenceAdapter` converts the provider into the wire-shape `LlmInferenceHandler` consumed by the RPC dispatcher.
- Client wiring: `llmInference.setProvider` is sent on connect; per-session adapter is attached alongside the existing sessionFs hook.
- New `llm_inference.e2e.test.ts` exercises the full RPC round-trip against the runtime.

Resolves github/copilot-sdk-internal#88

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Matches the runtime move of `llmInference.httpRequest` out of the session-scoped client API and onto a new `clientGlobal` schema root.

- Codegen emits a new `registerClientGlobalApiHandlers` alongside the existing `registerClientSessionApiHandlers`. Handlers passed to it are dispatched directly (no per-session `getHandlers` callback) and carry no implicit sessionId; sessionId, when present, is just a payload field on the call.
- `CopilotClient` now constructs the LLM inference adapter once and registers it process-wide via `registerClientGlobalApiHandlers` during connection setup. The per-session `setupLlmInference` path and the `SessionConfigBase.createLlmInferenceProvider` override are removed; there is no longer any per-session notion of which provider to use.
- `LlmInferenceConfig.createLlmInferenceProvider` is now `() => LlmInferenceProvider` (was `(session) => ...`).
- `LlmInferenceRequest` exposes the new optional `sessionId` field so consumers can correlate requests with a runtime session when one is in scope.

E2E test updated to verify the global registration works and that sessionId is populated on in-session traffic.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

With the Rust runtime intercept chokepoint in place, every model-layer HTTP request - including /models and /models/session - is now dispatched through the SDK callback. Update the e2e test to:

- Stub realistic responses for non-streaming model catalog and session endpoints (so the runtime can proceed past model resolution).
- Hard-assert the catalog request is intercepted (no more 'either-or' fallback for the pre-rust-intercept state).

Streaming inference requests still pass through to the recorded CAPI proxy; a fully-mocked end-to-end inference test will land alongside the streaming-intercept commit.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Extends LlmInferenceProvider with an optional onLlmStreamRequest method that returns a response head synchronously and pushes body chunks via the provided sink. The adapter implements the generated httpStreamStart RPC method and forwards chunks back to the runtime via the typed server-RPC client (llmInference.streamChunk / streamEnd).

Adds a fully-mocked e2e test (test/e2e/llm_inference_stream.e2e.test.ts) that drives a complete user->assistant turn through the callback alone: the runtime hits the callback for /models, /models/session, and the chat completion itself, and the assistant text returned to the SDK consumer is the synthetic text supplied by the stub.

- nodejs/src/llmInferenceProvider.ts: LlmInferenceStreamSink, onLlmStreamRequest, httpStreamStart adapter
- nodejs/src/client.ts: pass a lazy server-RPC accessor into the adapter
- nodejs/src/index.ts: re-export new types
- nodejs/test/e2e/llm_inference_stream.e2e.test.ts: full-mock e2e
- nodejs/src/generated/*, python/*, go/*, rust/*: codegen for new RPC methods
- dotnet/src/Generated/*: codegen for new RPC methods

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Adds test/e2e/llm_inference_errors.e2e.test.ts that wires a callback whose inference handler throws a synthetic transport error and verifies the failure surfaces to the SDK consumer (the call does not hang and any error caught is non-empty). Confirms the runtime's existing retry / error reporting path handles callback-side failures the same way it handles real transport failures. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Mirrors the runtime-side cleanup: the callback wire no longer carries providerType / endpointKind / wireApi / transport / modelId. Adapter stops forwarding the field, e2e tests filter by URL instead of metadata, and the missing LlmInferenceStreamSink / LlmInferenceStreamStartResponse re-exports in types.ts are added so index.ts type-checks cleanly. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

[Phase 3] Realign the Node SDK with the runtime's new four-method chunk protocol.

One unified provider callback:

    interface LlmInferenceProvider {
        onLlmRequest(req: LlmInferenceRequest): Promise<void>;
    }

LlmInferenceRequest exposes:

* url / method / headers / sessionId
* requestBody: AsyncIterable<Uint8Array> // body delivered as chunks
* responseBody: LlmInferenceResponseSink // start/write/end/error

The sink enforces start -> 0..N writes -> exactly one of end/error and maps each call to the corresponding httpResponseStart / httpResponseChunk RPC. createLlmInferenceAdapter maintains a per-requestId state map; the generated httpRequestStart handler registers state synchronously and fires onLlmRequest in the background, so the runtime's RPC reply isn't gated on consumer I/O.

The body queue iterator now latches a 'done' flag so a consumer that calls .next() again after end:true gets done back instead of blocking forever waiting for chunks the runtime will never send.

Removes the previous onLlmRequest + onLlmStreamRequest split and the LlmInferenceResponse / LlmInferenceStreamSink / LlmInferenceStreamStartResponse public types. All three e2e tests rewritten against the unified callback (one of them URL-dispatches /responses -> SSE and /chat/completions -> buffered JSON; the consumer can also branch on whether the request body has stream:true).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
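
The sink ordering described in this commit (start, then zero or more writes, then exactly one of end/error) can be sketched as a small state machine. This is a minimal illustration rather than the SDK's adapter: `ResponseSinkSketch`, `ResponseFrame`, and the `emit` callback are hypothetical names, and the sketch assumes `error` is only valid after `start`, which the commit message implies but does not spell out.

```typescript
// Hypothetical sketch; names are illustrative, not the SDK's API.
type SinkState = "idle" | "started" | "closed";

interface ResponseFrame {
    kind: "start" | "chunk" | "end" | "error";
    status?: number;
    data?: Uint8Array;
    code?: string;
}

class ResponseSinkSketch {
    private state: SinkState = "idle";
    // `emit` stands in for whatever sends a frame back to the runtime.
    private readonly emit: (frame: ResponseFrame) => void;

    constructor(emit: (frame: ResponseFrame) => void) {
        this.emit = emit;
    }

    start(status: number): void {
        if (this.state !== "idle") {
            throw new Error("start() must be the first call and may only happen once");
        }
        this.state = "started";
        this.emit({ kind: "start", status });
    }

    write(data: Uint8Array): void {
        this.requireStarted("write()");
        this.emit({ kind: "chunk", data });
    }

    end(): void {
        this.requireStarted("end()");
        this.state = "closed";
        this.emit({ kind: "end" });
    }

    error(code: string): void {
        this.requireStarted("error()");
        this.state = "closed";
        this.emit({ kind: "error", code });
    }

    // Rejects any call made before start() or after a terminal end()/error().
    private requireStarted(op: string): void {
        if (this.state !== "started") {
            throw new Error(`${op} is only valid after start() and before end()/error()`);
        }
    }
}
```

Throwing on misuse is one possible policy; an implementation could equally ignore late calls, and the commit message does not say which the SDK chose.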

Phase 4.1: expose an AbortSignal on the request envelope, abort it on a cancel chunk from the runtime, and map consumer-side aborts to a 499 + error{code:cancelled} response. Adds the cancellation e2e test. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
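
A rough sketch of the abort-to-response mapping this commit describes. Only the 499 status and the `cancelled` error code come from the commit message; the function name, the `TerminalResponse` shape, and the 502 / `transport_error` fallback are assumptions made for illustration.

```typescript
// Hypothetical sketch; only 499 and "cancelled" are taken from the commit message.
interface TerminalError {
    code: string;
    message: string;
}

interface TerminalResponse {
    status: number;
    error: TerminalError;
}

// True when the failure looks like an abort, e.g. the AbortError raised by fetch.
function isAbortError(err: unknown): boolean {
    return typeof err === "object" && err !== null
        && (err as { name?: unknown }).name === "AbortError";
}

// Maps a failure raised by consumer code to the response reported to the runtime.
function toTerminalResponse(err: unknown, signal: AbortSignal): TerminalResponse {
    const message = err instanceof Error ? err.message : String(err);
    if (signal.aborted || isAbortError(err)) {
        // Cancellation: report 499 with code "cancelled".
        return { status: 499, error: { code: "cancelled", message } };
    }
    // Assumption: every other failure is reported as a generic transport error.
    return { status: 502, error: { code: "transport_error", message } };
}
```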

Add an e2e test asserting that when the SDK consumer signals a terminal error via responseBody.error({ code: 'cancelled' }) the runtime surfaces it faithfully as a request failure rather than hanging. Completes the consumer->runtime direction of Phase 4.1. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Surface the new `transport` discriminator on `LlmInferenceRequest` so consumers can tell an `"http"` request (plain HTTP / SSE) from a `"websocket"` one (full-duplex: each request-body chunk is one inbound WS message, each response-body write one outbound message). The adapter threads `params.transport` through, defaulting to `"http"`. Regenerate rpc.ts against the runtime schema for the new field and add an e2e test exercising the full-duplex path: the fake model advertises `ws:/responses`, the runtime's WebSocket flag is enabled via env var, and the consumer pumps `/responses` events back per inbound message. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Friendly product-code starting point for SDK consumers who want to observe or mutate LLM inference requests/responses by overriding virtual methods on a base class. Implements LlmInferenceProvider, so an instance can be returned directly from createLlmInferenceProvider.

Default behaviour is a transparent pass-through: each request is forwarded to its original URL via the WHATWG fetch global (HTTP) or WebSocket global (WebSocket), and the upstream response is streamed back unchanged. The same subclass handles both transports - onLlmRequest dispatches on req.transport.

Virtual hooks:

- HTTP: transformRequest, forward, transformResponse
- WebSocket: forwardWebSocket, transformRequestMessage, transformResponseMessage

E2e test (llm_inference_handler.e2e.test.ts) demonstrates a single TestHandler subclass servicing both an HTTP turn (single-shot title generation) and a WebSocket turn (main agent turn) against a per-test in-process http+ws upstream that speaks the real CAPI shapes.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Review fixes for github/copilot-sdk-internal#88 (Node SDK side).

- Honor the runtime's accepted=false ack: the response sink now aborts the provider's signal and stops emitting once the runtime drops the request (I1).
- Add a staging backstop in the adapter so a body chunk that arrives before its start frame is buffered and replayed rather than silently dropped (B1).
- Run the WebSocket request/response pumps concurrently and race their terminal states, so an upstream-closes-first (or runtime-cancels-first) case tears the other side down instead of hanging on a parked iterator (B2).
- Buffer inbound WS frames in wrapGlobalWebSocket until onMessage is registered so the first frames of a fast upstream aren't dropped.
- Collapse the dead send branch, hoist TextEncoder/TextDecoder singletons, and correct the LlmWebSocketUpstream.onClose contract doc.
- Update CopilotClientOptions.llmInference docs: streaming SSE and WebSocket are intercepted, not bypassed (I6).
- Add unit tests: chunk-before-start staging, accepted=false abort, WS upstream-close-first finalisation, and WS upstream-error propagation.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
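
The "buffer inbound frames until a listener is registered" fix can be illustrated with a small queue. This is a sketch of the general technique only: `InboundFrameBuffer`, `push`, and `onMessage` are hypothetical names and do not reflect how `wrapGlobalWebSocket` is actually written.

```typescript
// Hypothetical sketch of buffering frames until a listener is attached.
type Frame = string | Uint8Array;

class InboundFrameBuffer {
    private readonly pending: Frame[] = [];
    private listener: ((frame: Frame) => void) | undefined;

    // Called for every frame the upstream socket delivers.
    push(frame: Frame): void {
        if (this.listener) {
            this.listener(frame);
        } else {
            // No listener yet: keep the frame instead of dropping it.
            this.pending.push(frame);
        }
    }

    // Registering the listener first replays buffered frames in arrival order.
    onMessage(listener: (frame: Frame) => void): void {
        let frame: Frame | undefined;
        // The listener is attached only after the backlog drains, so frames
        // pushed re-entrantly during replay are queued and keep their order.
        while ((frame = this.pending.shift()) !== undefined) {
            listener(frame);
        }
        this.listener = listener;
    }
}
```

The same shape would also fit the chunk-before-start staging backstop mentioned in this commit, with body chunks held until the start frame arrives.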

Drives a CAPI session and a BYOK (openai/responses) session entirely through the LLM inference callback — the consumer fabricates every model-layer response, so the CAPI record/replay proxy is never the inference endpoint. Asserts each in-session inference request carries req.sessionId === session.sessionId and that the two session ids differ. The mock branches /responses on the request stream flag: BYOK turns whose config-derived model does not advertise streaming issue a buffered (non-streaming) /responses request expecting a single JSON response object, whereas the CAPI turn streams via SSE. This mirrors real upstream behaviour and confirms the callback transport faithfully delivers both shapes. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Mirrors the TypeScript LLM inference callback feature in the .NET SDK so consumers can observe/mutate the model-layer HTTP/WebSocket requests the runtime issues (CAPI and BYOK), with the runtime session id threaded into each callback.

- scripts/codegen/csharp.ts: emit the clientGlobal handler interface + registration so Rpc.cs gains the llmInference handler surface.
- LlmInferenceProvider.cs: low-level ILlmInferenceProvider API + adapter (request staging, response sink state machine) behind an internal ILlmInferenceResponseChannel seam for unit testing.
- LlmRequestHandler.cs: idiomatic pass-through base class mapping to HttpRequestMessage/HttpResponseMessage and ClientWebSocket, with virtual transform/forward hooks for both transports.
- Types.cs/Client.cs: wire LlmInferenceConfig into the client and register the provider on start.
- Tests: factored unit-test infra (recording channel/sink, inline provider, frame builders) with adapter + handler tests, plus CAPI+BYOK e2e tests asserting the session id reaches the callback. e2e provider emits raw JSON (reflection-free STJ) and serves all model-layer traffic off-network.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Hide the redundant low-level provider interface and adapter from the public surface in both SDKs; the sole public extension point is now the LlmRequestHandler base class. Replace the LlmInferenceConfig provider factory with a direct handler instance (the provider is client-global, constructed once with no args). .NET: ILlmInferenceProvider + the LlmInferenceRequest/ResponseInit/ResponseSink DTOs become internal; LlmRequestHandler implements the interface explicitly so OnLlmRequestAsync leaves its public surface. LlmInferenceConfig.Handler replaces the Func<LlmRequestHandler> factory. TS: stop exporting LlmInferenceProvider and createLlmInferenceAdapter from index.ts; LlmInferenceConfig.handler replaces createLlmInferenceProvider. The request/sink DTOs stay exported as onLlmRequest's contract (TS lacks explicit interface implementation). E2E providers become LlmRequestHandler subclasses overriding onLlmRequest. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Collapse the HTTP callback seam to SendRequest/sendRequest, replace websocket hooks with per-connection handlers, and update tests to use the forwarding handler model. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Port the LLM inference callback feature to the Python SDK, mirroring the existing Node.js and .NET implementations. Consumers subclass `LlmRequestHandler` and override `send_request` (idiomatic httpx) for HTTP or `open_web_socket` (websockets) for the WebSocket transport; both default to transparent pass-through. Wired through `LlmInferenceConfig` on the client, registered on the `clientGlobal.llmInference` scope. Adds the low-level provider/adapter, the httpx-based handler base class, client wiring, public exports, and httpx as a core dependency. Extends the Python codegen to emit clientGlobal handler registration and regenerates the generated RPC bindings. Includes 8 e2e test files (10 tests) mirroring the Node.js suite — round trip, session-id threading (CAPI + BYOK), streaming SSE, error mapping, runtime cancel, consumer cancel, WebSocket transport, and the idiomatic handler against a real local HTTP+WebSocket upstream. All pass off-network. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Mirror the existing Node/.NET/Python LLM inference callback support in the Go SDK. Consumers register an LlmInferenceProvider (or the idiomatic LlmRequestHandler over net/http + coder/websocket) via ClientOptions.LlmInference; the runtime routes every model-layer HTTP and WebSocket request through it for both CAPI and BYOK sessions.

- Codegen (scripts/codegen/go.ts) now emits the clientGlobal handler registration, regenerating go/rpc/zrpc.go.
- New low-level provider types + adapter (llm_inference_provider.go) and the idiomatic forwarding handler (llm_request_handler.go).
- Wire LlmInferenceConfig into ClientOptions and the connect/start paths.
- 8 off-network e2e scenarios mirroring the other SDKs (basic, session id, stream, errors, cancel, consumer cancel, websocket, handler).

Also fixes a pre-existing Go e2e compile break (AttachmentBlob.Data became *string in the Rust contract regen baseline) that blocked the e2e package.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- Carry per-frame binary/text through the request body channel via a new CopilotWebSocketMessage type so ForwardingCopilotWebSocketHandler forwards binary frames as WebSocket binary frames instead of always text.
- Rename CopilotWebSocketCloseStatus.Code to ErrorCode to match the cross-SDK field naming.
- Actually close the request body (defer r.Body.Close()) in the e2e fake upstream instead of discarding the method value.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…sponse

- Rename the HTTP intercept hook send_http -> send_request to match the cross-SDK majority (Node.js sendRequest, .NET SendRequestAsync, Python send_request); update doc links and e2e handler impls.
- Expand the open_websocket doc comment to explain that, unlike the other SDKs, the consumer must store the CopilotWebSocketResponse argument in the returned handler and call send_message on it (there is no base-class send_response_message helper in the Rust trait).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- Rename the HTTP intercept hook sendHttp -> sendRequest (base class + all e2e overrides) to match the cross-SDK majority naming.
- Replace deprecated JsonNode.fields() with properties() in parseHeaders.
- Guard Integer.parseInt of the Content-Length header in the e2e fake upstream against a malformed value (NumberFormatException).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- Document intentional empty-except blocks (upstream close, cancelled-task unwind, helper/test cleanup paths)
- Narrow overly-broad `except BaseException` to `except Exception` in the cancel/error e2e test
- Remove unused `E2ETestContext` imports from cancel-error and session-id e2e tests

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Use `using var` so the temporary port-probe listener is disposed (CodeQL flagged the prior Stop()-only path as a leaked IDisposable). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

The runtime completes all internal cleanup before responding to the runtime.shutdown RPC and then deliberately keeps only its JSON-RPC server alive to send the response; it never self-exits (callers own termination). Since PR #1667, every client stop() additionally waited up to the 10s runtime-shutdown grace for a child self-exit that by contract never happens, then fell back to terminate/kill anyway. This made every client teardown burn the full grace window, which showed up as ~1 minute-per-test e2e slowness. Drop the post-shutdown self-exit wait in all six SDKs: once the shutdown RPC has completed (or failed), terminate the already cleaned-up child immediately and only wait to reap it. Graceful internal cleanup is unchanged - we still await runtime.shutdown before terminating. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Make CopilotWebSocketHandler the concrete handler that forwards to the upstream Copilot service by default, and introduce CopilotWebSocketHandlerBase as the lower-level abstraction that does no upstream forwarding. Previously the forwarding behavior lived in ForwardingCopilotWebSocketHandler while the base CopilotWebSocketHandler did not forward, which made it non-obvious that overriding the handler required switching to the Forwarding* type to preserve passthrough. Applies the rename across the four object-oriented SDKs (Node, .NET, Java, Python) and updates the corresponding e2e tests. Go and Rust use composition and are unaffected. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Aligns the Python CopilotRequestHandler WebSocket hook name with the Rust SDK's open_websocket for cross-language consistency. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

github-actions

Generated by SDK Consistency Review Agent for issue #1689 · sonnet46 3.2M

Mirror the Node.js SDK, where createCopilotRequestAdapter is an internal RPC-wiring adapter that is intentionally not re-exported from the package entrypoint. Consumers configure request_handler on CopilotClientOptions and never call the adapter directly; its second parameter also takes an internal generated type, making it unsuitable as a stable public API. The function remains importable from copilot.copilot_request_handler (as client.py already does); only the top-level copilot namespace and __all__ entries are removed. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…age helpers

Addresses two cross-SDK consistency gaps on the Go request-handler API flagged in review:

- Widen OnSendRequestMessage / OnSendResponseMessage from func([]byte) []byte to func(CopilotWebSocketMessage) *CopilotWebSocketMessage so callbacks can inspect and change a frame's text/binary type, matching the CopilotWebSocketMessage-based hooks in the .NET, Rust, and Java SDKs. Returning nil still drops the frame, preserving existing semantics.
- Add Text(), NewTextMessage(), and NewBinaryMessage() convenience helpers to CopilotWebSocketMessage, mirroring the factory/getter helpers the other strongly-typed SDKs provide.

This is a new experimental API, so aligning the shape now avoids a later breaking change. Updates the lone internal call site in the e2e handler test.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…bSocketForwarder

Unifies the WebSocket interception handler naming across all six SDKs so the protocol/contract is always CopilotWebSocketHandler and the default forwarding implementation is always CopilotWebSocketForwarder.

- OO SDKs (Node, .NET, Java, Python): abstract CopilotWebSocketHandlerBase -> CopilotWebSocketHandler (protocol); concrete CopilotWebSocketHandler -> CopilotWebSocketForwarder (forwarder). Java files renamed accordingly.
- Go/Rust: concrete ForwardingCopilotWebSocketHandler -> CopilotWebSocketForwarder (plus constructor/builder); the CopilotWebSocketHandler interface/trait is unchanged.

Also rename the Rust CopilotRequestTransport::Websocket enum variant to WebSocket for proper-noun consistency.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

github-actions · 2026-06-23T09:07:59Z

+
+/// A buffered HTTP request handed to [`CopilotRequestHandler::send_request`].
+#[non_exhaustive]
+pub struct CopilotHttpRequest {


PR description vs. implementation mismatch — custom types, not http::Request/http::Response

The PR description's comparison table lists Rust's HTTP type as http::Request / http::Response (the standard http crate types). The actual implementation uses SDK-defined CopilotHttpRequest / CopilotHttpResponse wrappers.

Every other SDK hands consumers the ecosystem's native HTTP request/response types so they compose naturally:

SDK	Request type	Response type
Node.js	`Request` (Fetch API)	`Response`
Python	`httpx.Request`	`httpx.Response`
.NET	`HttpRequestMessage`	`HttpResponseMessage`
Go	`*http.Request` (via `RoundTripper`)	`*http.Response`
Java	`HttpRequest` (`java.net.http`)	`HttpResponse<InputStream>`
Rust	`CopilotHttpRequest` (custom)	`CopilotHttpResponse` (custom)

Using custom types is a reasonable trade-off (avoids the B: Body generic parameter dance on http::Request<B>), but it does mean Rust consumers can't pass the intercepted request directly to a reqwest::Client or any other tower::Service<http::Request> middleware. Two small suggestions:

Update the PR description table to say "SDK-specific CopilotHttpRequest / CopilotHttpResponse" rather than the standard crate types.

Consider adding a From<CopilotHttpRequest> for reqwest::Request conversion (or at least the reverse) so consumers who want to forward to a custom backend have a clean path without reconstructing the whole request manually.

The handler rename shortened a javadoc line, so Spotless' wrapping no longer matched. Ran spotless:apply (javadoc reflow only, no semantic change). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

github-actions · 2026-06-23T09:19:53Z

Cross-SDK Consistency Review ✅

This PR adds HTTP/WebSocket request callback support to all six SDK implementations simultaneously. Here's the cross-language consistency assessment:

API surface — all SDKs aligned

Concept	Node.js	Python	Go	.NET	Java	Rust
Handler class/type	`CopilotRequestHandler`	`CopilotRequestHandler`	`CopilotRequestHandler`	`CopilotRequestHandler`	`CopilotRequestHandler`	`CopilotRequestHandler` trait
HTTP hook	`sendRequest(Request, ctx)`	`send_request(httpx.Request, ctx)`	`Transport http.RoundTripper`	`SendRequestAsync(HttpRequestMessage, ctx)`	`sendRequest(HttpRequest, ctx)`	`send_request(CopilotHttpRequest, ctx)`
WebSocket hook	`openWebSocket(ctx)`	`open_websocket(ctx)`	`OpenWebSocket func(ctx)`	`OpenWebSocketAsync(ctx)`	`openWebSocket(ctx)`	`open_websocket(ctx, response)`
Options field	`requestHandler`	`request_handler`	`RequestHandler`	`RequestHandler`	`requestHandler`	`request_handler`
Forwarder	`CopilotWebSocketForwarder`	`CopilotWebSocketForwarder`	`CopilotWebSocketForwarder`	`CopilotWebSocketForwarder`	`CopilotWebSocketForwarder`	`CopilotWebSocketForwarder`

E2E test coverage — all SDKs present

All six SDKs ship e2e tests covering: handler error surfacing, runtime-driven cancellation, HTTP+WebSocket forwarding, and session ID threading.

Intentional design deviations (documented and appropriate)

Go uses a struct with function fields (Transport http.RoundTripper, OpenWebSocket func(...)) instead of class inheritance — idiomatic Go (no virtual methods). HTTP body arrives in CopilotRequestContext via a channel, then flows through the RoundTripper as a normal *http.Request.
Rust passes the CopilotWebSocketResponse writer as a parameter to open_websocket rather than exposing send_response_message on the handler class — this is documented inline as a known difference and is the idiomatic Rust ownership model. Go follows the same injection pattern via Open(ctx, WebSocketResponseWriter).
Go's CopilotWebSocketForwarder exposes mutation via hook function fields (OnSendRequestMessage, OnSendResponseMessage) rather than method overrides (as in OO languages) — idiomatic Go composition over inheritance.
Cancellation uses each language's native mechanism: AbortSignal (Node.js), asyncio.Event (Python), context.Context (Go), CancellationToken (Rust/.NET), CompletableFuture<Void> (Java) — all correct and idiomatic.
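To make the Python entry in the cancellation list concrete, here is a minimal sketch of racing in-flight work against an `asyncio.Event`. It uses only the standard library. The names `run_until_cancelled`, `upstream_call`, and `demo` are hypothetical, and how the SDK exposes the event on its request context is not shown in this review, so treat that wiring as an assumption.

```python
import asyncio


async def run_until_cancelled(work, cancel_event: asyncio.Event):
    """Await `work`, but stop early if `cancel_event` is set first.

    Returns ("done", result) or ("cancelled", None).
    """
    work_task = asyncio.ensure_future(work)
    cancel_task = asyncio.ensure_future(cancel_event.wait())
    done, _pending = await asyncio.wait(
        {work_task, cancel_task}, return_when=asyncio.FIRST_COMPLETED
    )
    if work_task in done:
        cancel_task.cancel()
        return ("done", work_task.result())
    # Cancellation won the race: abort the in-flight work and wait for it to unwind.
    work_task.cancel()
    try:
        await work_task
    except asyncio.CancelledError:
        pass
    return ("cancelled", None)


async def upstream_call(seconds: float):
    # Hypothetical stand-in for forwarding an HTTP request upstream.
    await asyncio.sleep(seconds)
    return "response"


async def demo(work_seconds: float, cancel_after: float):
    cancel_event = asyncio.Event()
    # Simulate the runtime signalling cancellation after `cancel_after` seconds.
    asyncio.get_running_loop().call_later(cancel_after, cancel_event.set)
    return await run_until_cancelled(upstream_call(work_seconds), cancel_event)
```

Calling `asyncio.run(demo(5.0, 0.01))` exercises the cancelled path, and `asyncio.run(demo(0.0, 5.0))` exercises the completed path.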

No consistency issues found

All differences across SDKs are deliberate language-idiomatic choices. The feature is implemented with good parity: same semantics, same public API shape, same test scenarios in every language.

Generated by SDK Consistency Review Agent for issue #1689 · sonnet46 2.6M

The body channel is internal framework plumbing: the adapter drains it for HTTP requests and pumps it to CopilotWebSocketHandler.SendRequestMessage for WebSocket requests. It was exported only because Go exports by capitalisation; no other SDK surfaces the body channel on the request context. A consumer reading it directly (e.g. via RequestContextFrom) would race the adapter's pump goroutine and lose frames. Lowercase it to body to match the other SDKs and keep it internal to the adapter layer. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
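The race described above is a general property of a queue with more than one reader, not something specific to Go channels. The following is a minimal sketch that uses Python's `asyncio.Queue` as a stand-in for the Go channel; it is not the adapter's code, and `drain`, `demo`, and the `frame-N` payloads are hypothetical. Each frame is delivered to exactly one reader, so a second reader takes frames away from the pump.

```python
import asyncio


async def drain(queue: asyncio.Queue, received: list):
    """Consume frames until the None sentinel arrives."""
    while True:
        frame = await queue.get()
        if frame is None:
            return
        received.append(frame)


async def demo(frame_count: int = 6):
    queue = asyncio.Queue()
    pump_frames, consumer_frames = [], []
    # The adapter's pump and a consumer reading the same queue directly.
    pump = asyncio.ensure_future(drain(queue, pump_frames))
    consumer = asyncio.ensure_future(drain(queue, consumer_frames))

    for i in range(frame_count):
        await queue.put(f"frame-{i}")
        await asyncio.sleep(0)  # yield so the readers get a chance to run

    # One sentinel per reader so both loops terminate.
    await queue.put(None)
    await queue.put(None)
    await asyncio.gather(pump, consumer)
    return pump_frames, consumer_frames
```

After `asyncio.run(demo())`, the two lists are disjoint and together contain every frame, which means the pump alone no longer sees the complete body. Keeping the queue private to the adapter removes the second reader.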

sessionId() returns null when the request was issued outside any session, but the nullability was only documented in the Javadoc. Every other SDK makes this explicit in the type system (TypeScript string?, Python str | None, .NET string?, Rust Option<String>). Add @Nullable from the already-declared spotbugs-annotations dependency on the field, constructor parameter, and getter so SpotBugs and IDEs surface null-safety warnings, aligning Java with the other SDKs. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

forceStop()/stop() dispose the JSON-RPC connection and destroy the underlying socket. If vscode-jsonrpc still has an in-flight write at that moment, most commonly the auto-generated response to a server->client request (a tool, hook, userInput, or LLM-inference handler that resolved just before teardown), the write rejects with ERR_STREAM_DESTROYED. That response write is internal to vscode-jsonrpc and awaited by nobody, so the rejection surfaces as an unhandled promise rejection (observed intermittently in the e2e suite, originating from pending_work_resume's cold-resume forceStop path, but reproducible by any consumer that forceStop()s while a server->client request is in flight).

Wrap the StreamMessageWriter so write failures can be swallowed, but only once a teardown-in-progress flag is set immediately before connection.dispose(). The writer still fires its error event (forwarded to MessageConnection.onError) and dispose() still rejects pending requests, so no signal is lost. Outside teardown the flag stays false, so write failures propagate normally and in-flight requests continue to fail fast.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
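The gating logic can be sketched independently of vscode-jsonrpc. The example below is an illustration in Python rather than the Node.js change itself: `TeardownAwareWriter`, its `tearing_down` flag, and the `on_error` callback are hypothetical names, and having the wrapper report errors (instead of the inner writer firing its own error event) is a simplifying assumption.

```python
class TeardownAwareWriter:
    """Wraps a write function; swallows write failures only during teardown."""

    def __init__(self, inner_write, on_error):
        self._inner_write = inner_write
        self._on_error = on_error
        # Set to True immediately before the connection is disposed.
        self.tearing_down = False

    def write(self, message):
        try:
            self._inner_write(message)
        except OSError as exc:
            # The error is always reported, so no signal is lost.
            self._on_error(exc)
            if not self.tearing_down:
                # Normal operation: propagate so callers fail fast.
                raise


def demo():
    errors = []

    def destroyed_stream(_message):
        # Stand-in for a write against an already-destroyed socket.
        raise OSError("stream destroyed")

    writer = TeardownAwareWriter(destroyed_stream, errors.append)

    propagated = False
    try:
        writer.write("response-1")  # flag is False: the failure propagates
    except OSError:
        propagated = True

    writer.tearing_down = True
    writer.write("response-2")  # flag is True: swallowed, but still reported

    return propagated, len(errors)
```

The flag is deliberately narrow: it changes behaviour only for writes that fail after teardown has begun, so an unrelated write failure earlier in the connection's life is still visible to the caller.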

SteveSandersonMS changed the title ~~Simplify LLM inference callback handlers~~ LLM inference callback support Jun 16, 2026

SteveSandersonMS force-pushed the stevesandersonms/llm-inference-callbacks branch from 815bbd0 to 7bc95c0 June 19, 2026 15:00

github-advanced-security AI found potential problems Jun 19, 2026

SteveSandersonMS commented Jun 22, 2026

Comment thread dotnet/src/Types.cs Outdated

SteveSandersonMS commented Jun 22, 2026

Comment thread dotnet/src/GitHub.Copilot.SDK.csproj Outdated

SteveSandersonMS commented Jun 22, 2026

Comment thread dotnet/src/LlmInferenceProvider.cs Outdated

SteveSandersonMS commented Jun 22, 2026

Comment thread dotnet/src/LlmInferenceProvider.cs Outdated

SteveSandersonMS commented Jun 22, 2026

Comment thread dotnet/src/LlmInferenceProvider.cs Outdated

stevesa and others added 18 commits June 22, 2026 14:38

Refine LLM inference callback handlers

c70adeb

Collapse the HTTP callback seam to SendRequest/sendRequest, replace websocket hooks with per-connection handlers, and update tests to use the forwarding handler model. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

SteveSandersonMS and others added 2 commits June 22, 2026 21:11

SteveSandersonMS and others added 3 commits June 22, 2026 21:41

Dispose TcpListener probe in .NET e2e GetFreePort helper

6180a8a

Use `using var` so the temporary port-probe listener is disposed (CodeQL flagged the prior Stop()-only path as a leaked IDisposable). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

SteveSandersonMS force-pushed the stevesandersonms/llm-inference-callbacks branch from 6b14a5f to 6180a8a June 22, 2026 20:41

SteveSandersonMS and others added 3 commits June 22, 2026 22:57

Python: rename open_web_socket to open_websocket to match Rust

130d119

Aligns the Python CopilotRequestHandler WebSocket hook name with the Rust SDK's open_websocket for cross-language consistency. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

SteveSandersonMS force-pushed the stevesandersonms/llm-inference-callbacks branch from 9f2a649 to 130d119 June 22, 2026 21:57

github-actions Bot reviewed Jun 22, 2026

Comment thread python/copilot/__init__.py Outdated

SteveSandersonMS changed the title ~~LLM inference callback support~~ HTTP request callback support Jun 22, 2026

github-actions Bot reviewed Jun 22, 2026

Comment thread go/copilot_request_handler.go Outdated

Comment thread go/copilot_request_handler.go

SteveSandersonMS mentioned this pull request Jun 22, 2026

Code-generate inbound (server→client) RPC dispatch for the Rust SDK #1764

Open

4 tasks

github-actions Bot reviewed Jun 22, 2026

Comment thread nodejs/src/copilotRequestHandler.ts Outdated

Comment thread rust/src/copilot_request_handler.rs Outdated

github-actions Bot reviewed Jun 23, 2026

Reflow CopilotWebSocketForwarder javadoc to satisfy Spotless

2cbec1f

The handler rename shortened a javadoc line, so Spotless' wrapping no longer matched. Ran spotless:apply (javadoc reflow only, no semantic change). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

SteveSandersonMS and others added 3 commits June 23, 2026 10:21

SDK	Request type	Response type
Node.js	`Request` (Fetch API)	`Response`
Python	`httpx.Request`	`httpx.Response`
.NET	`HttpRequestMessage`	`HttpResponseMessage`
Go	`*http.Request` (via RoundTripper)	`*http.Response`
Java	`HttpRequest` (java.net.http)	`HttpResponse<InputStream>`
Rust	`CopilotHttpRequest` (custom)	`CopilotHttpResponse` (custom)
